ABSTRACT

In Piaget's developmental psychology the fourth and 
highest stage of human cognitive development is that of formal 
operations. The research on formal thought instruments is outlined. 
This study was designed to construct and validate paper-and-pencil 
instruments which could be used to select students capable of 
abstract conceptualization, hypothetico-deductive thought, and 
combinational reasoning. Three tests in different content areas were 
developed using item specifications found in the Piagetian literature 
on formal operations. Items were six-choice logic items with abstruse 
content. These tests, four Piagetian formal thought tasks, and a 
measure of verbal intelligence were administered to a sample of 
above-average teenagers. The formal operational reasoning tests were 
demonstrated to have substantial content validity, modest concurrent 
validity, and limited construct validity. Six item structures were 
found to have uniformly high first principal component factor 
loadings, validity indices, and reliability indices. The need for 
further research in this area is pointed out. (DG) 
I. Introduction

Piaget has formulated a developmental psychology which indicates an
invariant sequence of four qualitatively distinct stages of human
cognitive development. The fourth stage, the stage of formal operations,
is characterized by the capacity to consider all the possible
relationships in a problematic situation and by the capacity to think in
a hypothetico-deductive manner. Formal operations are internalizable,
reversible actions which are coordinated in an integrated system and
which are based on propositions.

To test for formal operations Piaget and his associates (Inhelder
and Piaget, 1958) have formulated a set of experimental tasks which
require the application of formal operations for their successful
resolution. For example, a billiard game has been used to test for the
understanding of the concept of equality of angles of incidence and
reflection which, in turn, is a manifestation of the capacity to
formulate the binary operation of reciprocal implication.

Formal thought has been measured with the following types of measures:
(1) Piagetian tasks; (2) verbal or numerical analogies; (3) test items
requiring comprehension of reading passages; (4) logic items. None of
these measures has been strictly validated.

In a longitudinal study of four years' duration Hughes (1965) tested 40
pupils of average and below-average scholastic ability yearly from the
age of 11+ years to the age of 14+ years. Four Piagetian tasks, including
the Equilibrium in the Balance task, were used. The task scores on the
fourth testing were correlated with other test scores such as those of
numerical analogies and non-verbal intelligence tests. With a principal
component analysis all of the tasks were found to have high correlations
(.57 - .81) with the first principal component.

Lovell and Butterworth (1966) tested 60 pupils with an array of
measures testing for the schema of proportion, including the Equilibrium
in the Balance task and the Projection of Shadows task. From a principal
component analysis of the scores they found that all of the measures
correlated highly with the first principal component. Using an array of
formal operations tasks and tests Lovell and Shields (1967) tested 50
pupils ranging in age from 8 to 10 years and having verbal IQ's in excess
of 140 as measured by the Wechsler Intelligence Scale for Children. Using
a principal component analysis they found that the tasks have quite high
correlations with the first principal component, including the
Equilibrium in the Balance task and the Colorless Chemicals task with .83
and .72 first component correlations respectively. From these three
studies, which indicate that Piagetian formal thought tasks have high
correlations with the first principal component, one can infer that the
tasks have substantial concurrent validity.

Research on formal operations using verbal or numerical analogies
includes the study by Lovell and Butterworth (1966) heretofore cited. The
English researcher Lunzer (1965) has argued that both verbal and
numerical analogies require the application of formal operational skills,
for second-order relations need to be recognized for the solution of the
analogy items; the capacity to formulate second-order relations is a
characteristic of formal thought. Lovell and Butterworth (1966) employed
20 tests (e.g., verbal analogies) involving proportion and showed that a
central intellective ability underlies all of these tests; the capacity
to understand proportions relates to the schema of proportion, which is
an aspect of formal thought.

Research on formal operations using analogy items, though sparse, has
given some credibility to the statement that analogy items are valid
measures of formal thought. Analogy items have content validity, for they
require the recognition of second-order relations and the use of the
schema of proportion, and analogy tests have concurrent validity, for
they have high positive correlations with other measures of formal
thought (e.g., tasks). However, more research needs to be done on the
validation of these measures, for it remains substantially in question
whether analogies test for a broad enough range of behaviors proper to
the stage of formal operations.

Studies on formal operations using test items requiring comprehension
of reading passages have been done by a variety of researchers (e.g.,
Stone, 1966). Case and Collinson (1962), Goldman (1965), and Hallam
(1967) have employed reading passages in such areas as literature,
religion, and history. The subjects in those three studies were
instructed to read the passages and then to answer a few questions. The
oral responses were recorded and then scored with the use of protocols
indicating the qualities of responses proper to each of the three highest
Piagetian cognitive stages.

Mary Ann Stone (1966) used a set of three reading passages in
literature, social studies, and science respectively, with a forty-item
multiple-choice test on each passage in which each item demands either
recall or application skills. She contended that comprehension and
application behaviors as discussed by Bloom et al. (1956) are proper to
formal thought, whereas recall behaviors are proper to lower-stage
thought. She determined that competency at application items is higher
and more homogeneous across content areas for older pupils than for
younger pupils. Though her contention that the tests used were valid
measures of formal thought is highly questionable, since no form of
validity was firmly established, she did demonstrate that competency in
thinking in various content areas (i.e., horizontal decalage) increases
with age.

Research on formal operational thought using logic items has been
sparse (e.g., Morf, 1957). Albert Morf (1957), who is an associate of
Jean Piaget at Geneva, stated that any problem that demands that an
individual reason deductively from a set of hypothetical premises with
unary and binary connectives (e.g., if . . . then) is a formal
operational problem. Shirley Ann Hill (1960) used logic items testing for
sentential logic, the classical syllogism, and the logic of
quantification. For each item the subject was asked to distinguish
between a necessary conclusion and the negation of a necessary
conclusion. Hill contended that these items tested for
hypothetico-deductive reasoning, which is a crucial characteristic of
formal thought.

However, O'Brien and Shapiro (1968) contended that her items were not
content valid, for her items did not, in addition, demand that the pupil
test the logical necessity of a conclusion. They determined that though
youngsters between 6 and 8 years of age are able to discriminate between
a necessary conclusion and its negation, they are unable to test the
logical necessity of a conclusion. Thus they concluded that
hypothetico-deductive reasoning ability cannot at all be attributed to
young children. Logic items as formal thought measures have so far
manifested only modest content validity.

In general, there presently exist no formal thought instruments that
have been extensively validated. The research reported here deals with
the construction and validation of formal operational reasoning
instruments.

II. Plan of the Study 
A. Subjects

A sample of ninety adolescents from Chicago area schools was used in
this study. Thirty scholastically above-average students were selected
from each of three age levels: 13 years of age, 16 years of age, and
19 years of age. Since the stage of formal operations was examined, it
was assumed that most pupils 13 years of age and over with above-average
scholastic achievement would have formal operational capabilities,
according to the Piagetian finding that formal operations develop in
Swiss children during the 12-15 year period of life (Inhelder and Piaget,
1958). There were 32 males and 58 females in the sample. This condition
of disproportionate sex sampling should not detract from the results of
the study, for evidence has accumulated that there are no sex effects
with respect to formal operational skills (e.g., Stone, 1966; O'Brien and
Shapiro, 1968).

B. Instruments

1. Piagetian Tasks.

To test for attainment of the stage of formal operations four formal
operations tasks devised by Piaget and his associates were employed. The
specifications and testing procedures required for these tasks have been
elaborated by Inhelder and Piaget (1958), by Lovell (1961), and by Hughes
(1965); these specifications and testing procedures were complied with in
this study.

The following four tasks have been determined by Piaget and his
associates to test for formal operational ability:

(a) the Oscillation of a Pendulum task which tests for the operations
of exclusion;

(b) the Conservation of Motion on a Horizontal Plane task which tests
for the conservation of motion concept;

(c) the Equilibrium in the Balance task which tests for the
understanding of the physical principle of a balance;

(d) the Projection of Shadows task which tests for understanding of
the physical principle relating the size of a shadow to the size
of an object projected and to the distances of the object from
the light source and from the surface of the shadow.

The plane and pendulum tasks require the experimental manipulation of
variables to confirm certain hypotheses. The balance and shadow tasks
require the discovery of such relations as proportionality and
reciprocity in physical systems.

These four tasks were administered to each subject in the sample and
the response to each task was given a rating of one of the stage levels
(I, II-A, II-B, III-A, and III-B) used by Piaget and his associates to
grade these tasks. The protocols used in this study were strictly
complied with in giving the ratings. A rating of III-A or III-B was given
to formal operational responses; a rating of II-A or II-B was given to
concrete operational responses; a rating of I was given to preoperational
responses.
2. Formal Operational Reasoning Instruments.

These instruments are measures of formal operational thought. Each
adolescent was administered three of these instruments: three formal
operational reasoning instruments set in the content areas of biology,
literature, and history respectively.

Piaget (1963) contends that the ability to accept absurd premises
(e.g., there was a dog with six heads) as such and to reason from these
premises in a purely deductive manner is formal operational. Morf (1957)
states that the ability to reason deductively from a set of premises in
which unary and binary connectives (e.g., if . . . then) are used is also
formal operational. The following item is an example of formal
operational reasoning devised by Morf (1957):

I think of an animal. If the animal has long ears, it may be either
an ass or a mule. If my animal has a big tail, it is either a mule or a
horse. Now, I want an animal with both long ears and a big tail. What
can it be?

Thus, any verbal or written item that requires an acceptance of a
set of absurd premises in which unary and binary connectives are used and
that requires a deduction problem to be solved based on the absurd
premises is a valid test of formal operational thought according to the
considerations of Piaget and of Morf. This general specification for the
formal operational reasoning instruments was adhered to in the
construction of many items. The following specifications were complied
with in the item construction:

i. Each item has either absurd (contrary-to-fact) declarative premises or
imaginary declarative premises; i.e., each premise in each item must be
either contrary to a fact that the subject knows to be a fact or must be
an imaginary statement which has no concrete referents and is beyond the
experience of the subject. An example of a contrary-to-fact premise is
the following: William Shakespeare wrote Tom Sawyer, but he did not write
Hamlet. An example of an imaginary premise is the following: Unicorns
travel only in pairs. The specification referring to absurd premises is
attributable to Jean Piaget (1963) and the specification referring to
imaginary premises is derivable from the discussion on formal thought by
Flavell (1963).

ii. Each item has unary and binary connectives (e.g., "if . . . then,"
"but", "and", "not", "neither . . . nor") being used in the premises. An
example of such a premise in which the binary connective "but" is used
is the following: William Shakespeare wrote Tom Sawyer, but he did not
write Hamlet. This specification is attributable to Albert Morf (1957),
who is an associate of Piaget.

iii. The task for each item requires a simple deduction through the use of
either propositional rules of inference or quantificational rules of
inference in order for the validly deducible response to be recognized. An
example of such an item, with the content area being biology and the
primary rule of inference to be used being modus tollens, is the following:

A. If butterflies can swim, then butterflies have gills.

Butterflies do not have gills, but they have fins. Therefore . . .

a. All butterflies can swim.

b. Either butterflies swim or they have gills.

c. If butterflies fly, then they swim.

d. Butterflies cannot swim.

e. Some butterflies have no fins.

f. Butterflies swim but have no fins.

The correct response to this item is d.

This specification is attributable to both Piaget (1963) and Morf (1957).
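
The butterfly item can be checked mechanically. The following is a minimal sketch, not part of the original instruments, that treats each statement as a single proposition (an assumption that collapses the quantifiers "all" and "some") and enumerates every truth assignment to see which of the six choices follows from the premises. The atom names and the encoding of each choice are illustrative only.

```python
from itertools import product

def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

# Hypothetical propositional atoms standing in for the item's statements.
ATOMS = ("swim", "gills", "fins", "fly")

def premises(v):
    # "If butterflies can swim, then butterflies have gills."
    # "Butterflies do not have gills, but they have fins."
    return implies(v["swim"], v["gills"]) and (not v["gills"]) and v["fins"]

# Assumed propositional reading of the six response choices.
CHOICES = {
    "a": lambda v: v["swim"],
    "b": lambda v: v["swim"] or v["gills"],
    "c": lambda v: implies(v["fly"], v["swim"]),
    "d": lambda v: not v["swim"],
    "e": lambda v: not v["fins"],
    "f": lambda v: v["swim"] and not v["fins"],
}

def entailed(conclusion):
    """True if the conclusion holds in every assignment satisfying the premises."""
    for values in product([True, False], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if premises(v) and not conclusion(v):
            return False
    return True

valid_choices = [key for key, choice in CHOICES.items() if entailed(choice)]
print(valid_choices)  # ['d']
```

Under this reading only choice d survives, which agrees with the keyed response; each of the other five choices is false under some assignment that makes the premises true.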

iv. It is indicated to the student in each test that he is to assume that
the premises for a given item are true. This specification is attributable
to Piaget (1963).

v. Each item in the tests is of the multiple-choice type with six choices. 

Items were constructed so that each required thought processes that
are appropriate to the stage of formal operations but not appropriate to
the stage of concrete operations; i.e., each item to be used must comply
with specifications i-iv for formal operational reasoning tests that have
heretofore been stated and that are attributable to Morf and Piaget. Two
high school teachers were chosen and trained in the appraisal of items as
conforming to the specifications cited. If they both agreed that the
items chosen complied with the specifications cited, then content
validation of the items would be achieved. There was no disagreement
between the two raters, for they both agreed that the items complied with
the specifications cited; thus, a content validation of the items was
achieved.

The observation that there was no disagreement between the raters as
to the compliance of the items with the specifications may seem
extraordinary. However, upon a closer examination of the procedure used
for content validation this observation may appear to be more reasonable.
First of all, both high school teachers who were raters had some
familiarity with symbolic logic to the extent that they knew basic
logical rules of inference and knew the English notation used in
symbolizing certain logical statements (e.g., the statement "John went to
the store and Mary went home" could be symbolized as "p.q" where p refers
to "John went to the store", q refers to "Mary went home", and "." refers
to "and"). Their familiarity with symbolic logic proved to be an aid in
their scrutiny of the test items.

The items were constructed and a statement of the specifications
was provided to each rater. Each rater was asked to examine each item
and to identify any item not complying with the item specifications. First
of all, each item was found to have either absurd or imaginary premises 
by each rater. Secondly, each item was examined and found to have unary 
and/or binary connectives (e.g., "not", "either . . . or", "and") being 
used in the item by each rater. Thirdly, each rater determined that each 
item required some logical rule of inference to be used for its correct 
resolution; in this phase the teachers used their familiarity with symbolic 
logic to ascertain item compliance to specification iii. Lastly, each 
rater indicated that the direction for the items made it sufficiently clear 
that the premises in each item were to be assumed to be true by each subject. 
Due partly to the prior training in symbolic logic of the raters and due
to the pointedness, simplicity, and clarity of the item specifications the 
raters were able to scrutinize the items for possible compliance to the 
item specifications and determined that the items had content validity. 

After a set of content valid items was constructed, a pre-testing of
the items was carried out to determine those items with high
point-biserial correlations, high factor loadings with the first
principal component, and item difficulties in the .10 - .90 range. A
subset of about 30 items was selected from those items to form the basic
items in the instrument. Versions of the test were constructed
maintaining the formal logical structures of the items but set in the
three content areas: biology, history, and literature. The testing time
for each of the tests was about forty minutes.
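
The two item statistics named in the pre-test can be illustrated briefly. The sketch below is an assumption about how such statistics are conventionally computed, not a reconstruction of the study's own procedure: item difficulty is taken as the proportion of subjects answering correctly, and the point-biserial correlation relates a 0/1 item score to the total score. The helper names and the sample data are hypothetical; only the .10 - .90 difficulty range comes from the text.

```python
from statistics import mean, pstdev

def item_difficulty(item):
    """Proportion of subjects answering the item correctly (item is a 0/1 list)."""
    return mean(item)

def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and total test scores."""
    p = mean(item)
    sd = pstdev(totals)
    if sd == 0 or p in (0, 1):
        return 0.0  # undefined in these cases; reported as 0 for convenience
    passed = [t for x, t in zip(item, totals) if x == 1]
    failed = [t for x, t in zip(item, totals) if x == 0]
    return (mean(passed) - mean(failed)) / sd * (p * (1 - p)) ** 0.5

def in_difficulty_range(item, low=0.10, high=0.90):
    """Check the .10 - .90 difficulty range mentioned in the text."""
    return low <= item_difficulty(item) <= high

# Hypothetical responses of four subjects to one item, with their total scores.
item = [1, 1, 0, 0]
totals = [4, 3, 2, 1]
difficulty = item_difficulty(item)
r_pb = point_biserial(item, totals)
```

For this toy data the difficulty is .50 and the point-biserial correlation is about .89, so the item would pass the difficulty screen; what counts as a "high" correlation is left open here because the text gives no cutoff.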

The formal logical structures of the thirty items chosen were set in
terms of the Polish notation of symbolic logic developed by J.
Lukasiewicz, a noted Polish logician. These thirty items were set in
various contents in the pre-test of sixty items. It may be noted that the
correct response for any given item forms the only consistent and valid
formula with the premises given. The other five responses for any given
item relate to inconsistent formulae that are invalid, for some choice of
truth values for the constituent premises yields true composite premises
and a false conclusion response. After the thirty items were selected
from the pre-test, the items were randomly ordered to form the standard
ordering of items in the formal operational reasoning instruments.
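
To make the role of Polish notation concrete, the following sketch evaluates prefix formulas and tests them for validity by exhausting the truth values of their variables. It assumes the conventional Lukasiewicz letters (N for negation, C for implication, K for conjunction, A for disjunction, E for equivalence) and lowercase letters for variables; the paper does not list the actual item formulas, so the two formulas shown are illustrative.

```python
from itertools import product

# Binary connectives in the conventional Lukasiewicz lettering (assumed here).
BINARY = {
    "C": lambda a, b: (not a) or b,  # implication
    "K": lambda a, b: a and b,       # conjunction
    "A": lambda a, b: a or b,        # disjunction
    "E": lambda a, b: a == b,        # equivalence
}

def evaluate(formula, values):
    """Evaluate a prefix (Polish notation) formula under a truth assignment."""
    def parse(i):
        symbol = formula[i]
        if symbol == "N":
            value, j = parse(i + 1)
            return (not value), j
        if symbol in BINARY:
            left, j = parse(i + 1)
            right, k = parse(j)
            return BINARY[symbol](left, right), k
        return values[symbol], i + 1  # a propositional variable

    value, end = parse(0)
    if end != len(formula):
        raise ValueError("formula has trailing symbols")
    return value

def is_tautology(formula):
    """True if the formula is true under every assignment to its variables."""
    atoms = sorted({c for c in formula if c.islower()})
    return all(
        evaluate(formula, dict(zip(atoms, combo)))
        for combo in product([True, False], repeat=len(atoms))
    )

# Modus tollens, ((p -> q) and not q) -> not p, is valid.
modus_tollens_valid = is_tautology("CKCpqNqNp")
# Affirming the consequent, ((p -> q) and q) -> p, is not.
affirming_consequent_valid = is_tautology("CKCpqqp")
```

The second formula fails when p is false and q is true: the composite premise is true and the conclusion is false, which is the sense in which the five distractor responses are described as invalid.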

The measure of competency with formal operations is the mean score
(mean number of items answered correctly) for the three reasoning scores;
this score is the formal operational competency score. An individual is
adjudged to be capable of formal operational thought if he obtains a
formal operational competency score greater than the upper 95% confidence
limit for the guessing score ($n/6 + \sqrt{5n}/3$, where n is the number
of items in the test; Gulliksen, 1950) for the three reasoning scores for
that individual.
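
The guessing limit is simple to compute. The sketch below assumes the expression reads n/6 + sqrt(5n)/3, which is algebraically the expected number of correct guesses on n six-choice items plus two binomial standard deviations; that interpretation is an inference from the formula, not a statement found in the text. The function name and the generalization to other numbers of choices are hypothetical.

```python
import math

def guessing_upper_limit(n_items, n_choices=6):
    """Expected chance score plus two binomial standard deviations.

    With n_choices = 6 this reduces to n/6 + sqrt(5n)/3.
    """
    p = 1.0 / n_choices
    expected = n_items * p
    sd = math.sqrt(n_items * p * (1.0 - p))
    return expected + 2.0 * sd

limit_30 = guessing_upper_limit(30)
print(round(limit_30, 1))  # 9.1
```

For a 30-item test the limit is about 9.08, so a competency score of at least 10 items correct would exceed it.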

3. Measure of General Intellectual Ability.

A measure of verbal intelligence entitled the Experimental Omnibus 
Vocabulary Test developed by Frederick Davis is the measure of general 
intellectual ability used in this study. The forty items used in this 
test were selected from a larger sample of vocabulary items as conforming
to a unidimensional model of verbal intelligence. In part, the test scores
were used in the construct validation procedure for the three formal opera- 
tional reasoning instruments. 

The three formal reasoning tests and the vocabulary test were
administered to the ninety adolescents in group-testing settings and the
four Piagetian tasks were administered to each adolescent individually.

III. Results 

A test has content validity if the items in the test require behaviors
for their successful resolution that are proper to the trait being
measured (Cronbach, 1960). The three formal reasoning tests were found to
have content validity. For example, the reasoning tests were found to
have all of their items fulfilling the Genevan specifications i-iv for
formal operational reasoning tests heretofore cited.

A test has concurrent validity if the test correlates highly positively
with direct tests measuring the same trait as the initial test (Cronbach,
1960). Concurrent validity of the formal reasoning tests was to be
determined in two phases. The first phase entails the examination of the
correlations between the formal operational task scores and each of the
sets of formal reasoning scores and the total reasoning scores; Table 1
depicts these correlations.

Table 2 indicates the lower bounds of the correlations corrected for 
attenuation between the total task scores and the four formal reasoning 
scores cited in Table 1. The correlations in Table 2 are lower bounds, for
in the computation of these correlations it was assumed that the reliability
of the composite task was 1.00. Thus the coefficient indicating the relation
between the set of four tasks and a given formal reasoning test conditional
on the usage of perfectly reliable instruments is at least the correspond- 
ing correlation designated in Table 2. 
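
The lower-bound argument can be illustrated with the standard correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The sketch below assumes that this is the correction used; the function name and the numerical values are hypothetical and are not taken from Tables 1-2.

```python
import math

def correct_for_attenuation(r_observed, test_reliability, task_reliability=1.0):
    """Disattenuated correlation: r_xy / sqrt(r_xx * r_yy).

    Leaving task_reliability at 1.00 reproduces the assumption in the text
    and therefore gives a lower bound on the corrected correlation.
    """
    return r_observed / math.sqrt(test_reliability * task_reliability)

# Hypothetical values, for illustration only.
lower_bound = correct_for_attenuation(0.40, 0.64)          # tasks assumed perfectly reliable
with_task_error = correct_for_attenuation(0.40, 0.64, 0.81)  # tasks less than perfectly reliable
```

Because any task reliability below 1.00 shrinks the denominator, the corrected value can only rise (here from about .50 to about .56), which is why the Table 2 entries are described as lower bounds.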

From an examination of the correlations in Tables 1-2 it can be stated 
that the relation between the formal reasoning scores and the total task 
scores for four Piagetian tasks is moderate even if perfectly reliable 
instruments are used. 

Being not uniformly high, the correlations in Table 1 lend weight to
the contention that the formal reasoning scores are moderately related to
the total task score for the four Piagetian tasks used. Thus the first
phase of the concurrent validation has provided information attesting to
the modest concurrent validity of the separate formal reasoning tests.
However, when these tests are combined, the concurrent validity (with or
without attenuation) is relatively high.

The second phase of concurrent validation entails the examination of
a contingency table relating the placement of individuals into cognitive
stages according to their formal operational task scores to the placement
of individuals into cognitive stages according to formal operational
competency scores. No adolescent subject was found to be at the
pre-operational stage of thought according to his Piagetian task
performances. The formal reasoning tests can only distinguish formal
thought capabilities from non-formal thought capabilities. All adolescent
subjects were placed at either the concrete stage of thought or the
formal stage of thought.

Individual task scores greater than three were judged to be formal
operational and thus formal operational task scores greater than 12 were
classified as formal operational. Subjects with task scores in the 5-12
range were classified as concrete operational. The upper 95% confidence
limit for the guessing score for the three reasoning tests is 9.1, which
equals $n/6 + \sqrt{5n}/3$ where n = 30 (Gulliksen, 1950). Subjects
receiving competency scores greater than 9.1 were classified as formal
operational. Adolescent subjects receiving competency scores less than or
equal to 9.1 were classified as concrete operational. Table 3 indicates
the placement of the 90 subjects into the two highest cognitive stages
according to the Piagetian tasks and according to the formal reasoning
tests.
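
The two classification rules and the resulting contingency table can be sketched as follows. The cutoffs of 12 and 9.1 come from the text; the score pairs are invented for illustration and do not reproduce Table 3, and the function names are hypothetical.

```python
from collections import Counter

TASK_CUTOFF = 12    # total task score above this: formal operational
TEST_CUTOFF = 9.1   # competency score above this: formal operational

def stage(score, cutoff):
    """Classify a score as formal or concrete operational."""
    return "formal" if score > cutoff else "concrete"

def contingency(pairs):
    """Count subjects by (stage from tasks, stage from reasoning tests)."""
    table = Counter()
    for task_score, competency_score in pairs:
        key = (stage(task_score, TASK_CUTOFF), stage(competency_score, TEST_CUTOFF))
        table[key] += 1
    return table

# Hypothetical (total task score, competency score) pairs.
pairs = [(16, 21.3), (14, 9.1), (11, 12.0), (10, 7.7)]
table = contingency(pairs)
agreement = (table[("formal", "formal")] + table[("concrete", "concrete")]) / len(pairs)
```

In this toy sample the two methods agree on two of four subjects. A sample with many subjects in the off-diagonal and concrete cells is what would be needed to judge how well the tests discriminate.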

As can be seen in Table 3, 86 subjects were found to be at the stage
of formal operations according to the two sets of measures; that is,
95.5% of the adolescent subjects were adjudged to be at the stage of
formal operations by both methods of stage measurement. The capacity of
the set of four Piagetian tasks to measure subjects at the stage of
formal operations is to a great extent shared by the set of formal
reasoning tests. However, this phase of the concurrent validation remains
inconclusive, for the alleged capacity of the formal reasoning tests to
distinguish subjects at the stage of formal operations was not verified
in this study. Hopefully, in the future a wide variety of subjects could
be chosen and tested with the tasks and reasoning tests and then could be
classified into cognitive stages according to their respective sets of
responses. The classifications according to the task scores and according
to the test scores could be examined and compared with the use of a
contingency table, and then the discriminative quality of the formal
reasoning tests could be determined. In this manner this phase of the
concurrent validation of the formal reasoning tests could be
accomplished.

It may be noted that no capacity of the formal reasoning tests to
classify subjects into lower cognitive stages has been acknowledged; thus
the classificatory range of the Piagetian tasks is recognized as being
greater than the classificatory range of the formal reasoning tests.

Construct validity of the formal reasoning tests was determined in
three phases. The first phase involved some of the techniques of
convergent and discriminant validation proposed by Campbell and Fiske
(1959) on the examination of a multitrait-multimethod matrix. Convergent
validity of a set of tests measuring a given trait is demonstrated if the
tests have high positive correlations with other tests employing a
different method measuring the same trait. It was hypothesized that the
correlations between the formal reasoning tests and the Piagetian tasks
would be large and positive, thus indicating the convergent validity of
the formal reasoning tests. Discriminant validity of a set of tests
measuring a given trait is demonstrated if the tests have small
correlations with measures of a different trait but employing a similar
method. It was thus hypothesized that the correlations between the three
formal reasoning tests and the measure of verbal intelligence would be
quite low, thus indicating the discriminant validity of the formal
reasoning tests.

Tables 4-7 indicate the intercorrelations among the eight constituent
cognitive variables used in this study for each of the three age levels
and the total sample. Most of the correlations in the rectangular
sub-matrices with the dotted lines are modestly significant and positive,
thus attesting to the limited convergent validity of the formal reasoning
tests; the formal reasoning tests are, in general, moderately correlated
with the Piagetian tasks.

The rectangular sub-matrices with the solid lines in Tables 4-7 relate
to the discriminant validation aspect of this phase, for they indicate
the correlations between the measure of verbal intelligence and the three
constituent measures of formal thought. Most of these values are modestly
significant. In addition, these correlations are, in general, of the same
magnitude as the correlations indicating convergent validity of the
formal reasoning tests. These conditions indicate that the formal
reasoning tests have little if any discriminant validity for two reasons:
(1) the formal reasoning tests have modest, not small as hypothesized,
correlations with a measure of a different trait; (2) the formal
reasoning tests are about as highly correlated with a measure of a
different trait (i.e., verbal intelligence) as they are with measures of
the same trait (i.e., formal thought).

The second phase of the construct validation entailed a scrutiny of
the age level means and standard deviations for the vocabulary test
scores and the formal reasoning test scores. Table 8 indicates this
information. It was hypothesized that the means of the vocabulary test
scores would indicate a decided positive monotone trend, whereas the
formal reasoning test score means would increase from the 13 year age
level to the 16 year age level but then level off and show no significant
increase from the 16 year age level to the 19 year age level. That
hypothesis is derivable from the observations that verbal intelligence
(e.g., vocabulary size) continues to grow well into adulthood (Guilford,
1967) but that formal reasoning becomes well established in a relatively
short period of time after its emergence (Inhelder and Piaget, 1958).
As can be discerned in Table 8 there is a definite positive monotone
trend among the vocabulary test means, whereas there is no significant
increase for any of the formal reasoning test measures between the 16
year age level and the 19 year age level. The statistics in Table 8
indicate that as age increases the vocabulary size tends to increase and
becomes more varied, thus substantiating an aspect of the hypothesis
being considered. Also, statistics in Table 8 indicate that between 16
and 19 years of age formal operational skills become somewhat fixed.
These two trends in the scores discernible in Table 8 attest to the
hypothesis that formal operations and verbal intelligence comply with two
different growth patterns.

The third phase involved the examination of some of the correlations and
partial correlations among the total task scores, the formal operational
competency scores, and the vocabulary scores. It was hypothesized that
verbal intelligence is not the primary component in the formal
operational relationship between the Piagetian tasks and the formal
reasoning tests. This hypothesis would be confirmed if the partial
correlation between the formal operational task scores and the formal
operational competency scores, holding vocabulary scores constant, was
similar to the correlation between the formal operational task scores and
the formal operational competency scores.

A second aspect of this third phase designated the hypothesis that little
remains of the relationship between verbal intelligence and formal
reasoning when the Piagetian formal task component is removed. This would
be confirmed if the partial correlation between the vocabulary scores and
the formal operational competency scores, holding formal operational task
scores constant, was appreciably less than the correlation between the
vocabulary scores and the formal operational competency scores. Table 9
indicates the four correlations relevant to this phase of the construct
validation.

From the correlations cited in Table 9 it can be determined that the
measure of verbal intelligence accounts for only 27.3% of the variance
shared by the Piagetian tasks and the formal reasoning tests and that the
two correlations considered in the first aspect of the third phase (.564 and
.488) are quite similar, as hypothesized. Also, the partial correlation between
the vocabulary scores and the formal operational competency scores holding
formal operational task scores constant (.159) is appreciably less than the
correlation between the vocabulary scores and the formal operational
competency scores (.358), as hypothesized. Thus this phase of the construct
validation provides evidence for the construct validity of the formal
reasoning tests.
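
For readers who wish to repeat this kind of comparison, the following is a minimal sketch of the standard first-order partial correlation formula. It is not taken from the study. The three input correlations are hypothetical values chosen for illustration, since the zero-order correlation between task scores and vocabulary scores does not appear in Table 9; the function and variable names are likewise assumptions.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    Standard formula: (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical zero-order correlations (illustrative only, NOT study data):
r_task_test = 0.50    # task scores with competency scores
r_task_vocab = 0.30   # task scores with vocabulary scores
r_test_vocab = 0.40   # competency scores with vocabulary scores

# Task-test relationship with vocabulary held constant.
r_task_test_given_vocab = partial_correlation(r_task_test, r_task_vocab, r_test_vocab)

# Vocabulary-test relationship with task scores held constant.
r_vocab_test_given_task = partial_correlation(r_test_vocab, r_task_vocab, r_task_test)

print(round(r_task_test_given_vocab, 3), round(r_vocab_test_given_task, 3))
```

The logic of the third phase corresponds to comparing each partial correlation with its zero-order counterpart: a small drop in the first and a large drop in the second would match the pattern reported in Table 9.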

To summarize the validation procedure findings, the formal operational
reasoning tests were demonstrated to have substantial content validity,
modest concurrent validity and limited construct validity.

From an examination of certain statistical and psychometric properties
of the tests used in the study, certain findings on the item structures can
be stated. Five item structures had relatively high average validity indices
(in excess of .100), high average reliability indices (in excess of .130),
relatively high first principal component factor loadings, and moderately
high item difficulties (in the .500 - .760 range). The high average validity
indices of these item structures, for example, indicated that performance
on any item with any of these item structures is highly related either to
performance on the four Piagetian tasks used or to the formal operational
competency score. No discernible pattern was evident among these item
structures: four of them were drawn from quantificational logic and one was
drawn from the logic of relations.
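
The item statistics named above can be illustrated with a short sketch. The definitions of the indices are not given in this section, so the code assumes the classical test-theory definitions: item difficulty as the proportion of correct responses, the item reliability index as the item standard deviation times the item-total correlation, and the item validity index as the item standard deviation times the item-criterion correlation. The response data and all names below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def item_statistics(item, total, criterion):
    """Item statistics under the ASSUMED classical definitions.

    item: 0/1 responses; total: test total scores; criterion: external scores.
    """
    p = sum(item) / len(item)          # difficulty: proportion answering correctly
    sd = math.sqrt(p * (1.0 - p))      # standard deviation of a 0/1 item
    return {
        "difficulty": p,
        "reliability_index": sd * pearson(item, total),
        "validity_index": sd * pearson(item, criterion),
    }


# Hypothetical data for six examinees (illustrative only, NOT study data).
item = [1, 0, 1, 1, 0, 1]             # responses to one item
total = [20, 12, 18, 22, 15, 19]      # total scores on the reasoning test
criterion = [9, 5, 8, 10, 6, 7]       # scores on an external criterion

stats = item_statistics(item, total, criterion)
print({name: round(value, 3) for name, value in stats.items()})
```

Under these definitions both indices are bounded by the largest possible standard deviation of a binary item (.500), which is why thresholds such as .100 and .130 can count as relatively high.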

More research is needed to determine which logical components (e.g.,
the presence of the connective "not") of a formal reasoning item with a given
item structure are responsible for the validity and difficulty of the item.
Research is also needed to determine whether items with more abstract content
or with more complex constituent sentences are more difficult than items
set in different content but with the same formal logical structure.

In general, highly reliable formal reasoning tests with items having
high reliability and validity indices form an objective for formal reasoning
test construction and would provide more accurate and valid measures of
formal thought. Also, formal reasoning tests whose items follow a factorial
design, with types of logical components (e.g., binary connectives
such as "and") designating the factors, could be used to determine those
qualities of the items that influence item discrimination and item
difficulty.

IV. Discussion 

The attempt to construct and validate paper-and-pencil formal operations
tests was somewhat successful. The formal reasoning tests developed in this
study could be used to determine the level of formal cognitive functioning
of each adolescent in a school. However, valid, "pure" paper-and-pencil
measures of formal thought were not developed; more research is needed to
resolve the methodological problems in the construction of such a valid,
"pure" instrument. Nevertheless, the set of procedures employed in this study
would provide a reasonable basis from which such instrument development could
take place. Valid, "pure" formal thought tests would have considerable value
not only for the practical purpose of measuring level of cognitive
development but also for instituting fertile psychological research in formal
thought and adult cognition.

In general, there presently exist no strictly validated instruments
testing for formal operations. With validated formal operations tests that
are easy to administer, educators could identify those students capable of
higher cognitive functioning. In addition, a standardized developmental
reasoning battery, consisting of paper-and-pencil instruments that measure
formal thought capabilities and other paper-and-pencil instruments that test
for other Piagetian stage behavior patterns, could be used extensively and
inexpensively to determine the level of cognitive development of each member
of the school population and to diagnose the cognitive inabilities of the
mentally retarded. Thus research on the measurement of operational thinking
has not only theoretical relevance but also extensive practical ramifications.

It is quite possible that the evaluation of subject matter achievement
and the measurement of cognitive development will be unified through
test construction from a Piagetian framework. For example, tests demanding
the same set of operational skills may be set in various content areas to
test for the generalizability of the operational skills and to test for
achievement in the content areas. Thus psychologically-parallel achievement
tests could be constructed that would indicate the level of cognitive
development and the achievement of an individual for a set of content areas.
It is anticipated that these contentions may have considerable effect on
measurement and evaluation practices in schools.
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Table 1 

Product-Moment Correlation Coefficients Between Formal Operational Task Scores
and Each of the Three Sets of Formal Operational Reasoning Scores
and the Total Formal Reasoning Scores

Piagetian Tasks     Formal Operational Reasoning Tests      Total
Age-level           Biology     History     Literature      Score

13 year old          .220        .096         .404           .333
16 year old          .513        .593         .523           .624
19 year old          .477        .484         .515           .571

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Table 2 



Lower Bounds for Pearson Product-Moment Correlation Coefficients
Corrected for Attenuation Between Formal Operational Task Scores
and Each of the Three Sets of Formal Operational Reasoning Scores
and the Total Formal Reasoning Scores

Piagetian Tasks     Formal Operational Reasoning Tests      Total
Age-level           Biology     History     Literature      Score

Total Group          .566        .535         .630           .792

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Table 3 

Contingency Table Indicating the Number of Subjects at the Concrete
Stage of Thought and at the Formal Stage of Thought According to
the Task Scores and According to the Competency Scores

                        Formal Reasoning Tests
Piagetian Tasks         Concrete      Formal      Totals

Concrete                    0            4           4
Formal                      0           86          86
Totals                      0           90          90

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Table 4 

Lower-Triangular Correlation Matrix for the Seven Measures of Formal Thought
and the Measure of Verbal Intelligence for the 13 Year Old Group

Variable                     1       2       3       4       5       6       7

1. Shadows Task
2. Balance Task           .746
3. Pendulum Task          .379    .566
4. Conservation Task      .368    .609    .377
5. Biology Test           .342    .249   -.103    .191
6. History Test           .209    .070   -.052    .069    .562
7. Literature Test        .253    .474    .286    .261    .504    .383
8. Vocabulary Test        .133    .278   -.086    .072    .101 [illeg.]   .384

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Table 5 

Lower-Triangular Correlation Matrix for the Seven Measures of Formal Thought
and the Measure of Verbal Intelligence for the 16 Year Old Group

Variable                     1       2       3       4       5       6       7

1. Shadows Task
2. Balance Task           .817
3. Pendulum Task          .682    .609
4. Conservation Task      .552    .596    .528
5. Biology Test           .330    .458    .500    .503
6. History Test           .524    .538    .533    .415    .626
7. Literature Test        .460    .499    .393    .429    .641    .582
8. Vocabulary Test        .316    .428    .425    .551    .202    .130   -.019

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Table 6 

Lower-Triangular Correlation Matrix for the Seven Measures of Formal Thought
and the Measure of Verbal Intelligence for the 19 Year Old Group

Variable                     1       2       3       4       5       6       7

1. Shadows Task
2. Balance Task           .528
3. Pendulum Task          .530    .487
4. Conservation Task      .638    .325    .261
5. Biology Test           .315    .596    .337    .229
6. History Test           .448    .552    .122    .330    .460
7. Literature Test        .390    .558    .319    .318    .699    .664
8. Vocabulary Test        .228    .431    .196    .171    .407    .248    .426






Table 7 

Lower-Triangular Correlation Matrix for the Seven Measures of Formal Thought
and the Measure of Verbal Intelligence for the Total Sample

Variable                     1       2       3       4       5       6       7

1. Shadows Task
2. Balance Task           .731
3. Pendulum Task          .560    .594
4. Conservation Task      .511    .538    .448
5. Biology Test           .379    .460    .328    .327
6. History Test           .396    .388    .221    .269    .508
7. Literature Test        .416    .536    .458    .375    .654    .461
8. Vocabulary Test        .160    .405    .376    .357    .325    .134    .394







Table 8 



Statistics of Five Cognitive Test Scores Over Three Age-Levels

                         Experimental  Biology      History      Literature   Mean
                         Omnibus       Formal       Formal       Formal       Formal
                         Vocabulary    Thought      Thought      Thought      Operational
Age                      Test          Test         Test         Test         Competency
Level         Statistic  Score         Score        Score        Score        Score

13 Year Old   mean       15.733        17.366       18.466       15.000       16.940
              s.d.        4.016         3.633        2.661        4.961        3.059

16 Year Old   mean       18.100        20.200       18.366       19.733       19.440
              s.d.        4.830         3.325        3.652        3.694        3.068

19 Year Old   mean       24.166        19.933       18.666       19.766       19.460
              s.d.        5.337         4.016        3.325        3.710        3.166

Total Sample  mean       19.333        19.166       18.500       18.166       18.613
              s.d.        5.907         3.848        3.205        4.693        3.286






Table 9 

Some Correlations Among Cognitive Measures for Total Sample

1. Correlation between formal operational competency scores
   and formal operational task scores                              = .564

2. Partial correlation between formal operational competency
   scores and formal operational task scores holding vocabulary
   scores constant                                                 = .488

3. Correlation between formal operational competency scores and
   vocabulary scores                                               = .358

4. Partial correlation between formal operational competency
   scores and vocabulary scores holding formal operational task
   scores constant                                                 = .159

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